Acute graft-versus-host disease (aGVHD) is a complication of allogeneic hematopoietic cell transplantation (allo-HCT) that contributes to significant morbidity and mortality. To our knowledge, no validated bedside model for predicting the risk of developing aGVHD following allo-HCT is currently available. It is also unknown whether a machine learning-based risk score can outperform traditional statistical models. We aimed to develop and validate a clinical risk score that identifies patients with significantly different risks of developing aGVHD grades 2-4 and 3-4 by day 100 post-transplant. Additionally, we compared the performance of the machine learning technique Bayesian additive regression trees (BART) with that of a traditional logistic regression model.

This analysis included adult patients who underwent allo-HCT between 2008 and 2019. Eligibility criteria were inclusive of a wide range of transplant indications, donor types, graft types, conditioning regimens, and GVHD prophylaxis regimens. The final cohort included 21,796 patients and was randomly split into training and validation sets consisting of 15,258 (70%) and 6,538 (30%) patients respectively. The training and validation sets were highly comparable across key characteristics. The most common categories for several key characteristics are shown in Table 1.

Patient-related variables analyzed included age, race and ethnicity, Karnofsky performance status, and hematopoietic cell transplantation-specific comorbidity index. Disease-related factors included disease type, cytogenetics, and disease status at allo-HCT. Transplant-related variables included conditioning regimen, donor and graft type, donor age (unrelated donors only), donor-recipient ABO matching, donor-recipient sex matching, donor-recipient CMV serostatus, GVHD prophylaxis, and in-vivo T-cell depletion status.

The primary outcome was aGVHD grade 2-4 and the secondary outcome was aGVHD grade 3-4, both summarized as event rate by day 100 post-transplant. Models were created using the training set. Logistic regression with a stepwise selection procedure was used to select prognostic factors for each outcome. Weighted scores were assigned to variables associated with each outcome based on the magnitude of their odds ratios. Risk scores were categorized into four groups using the 25th, 50th and 75th percentiles from the training set as cut points, and the association of risk score group with each outcome was tested in the independent validation cohort. Using the same training and validation sets, a BART model was fit to the training data and used to make predictions on the validation data.
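The scoring and grouping step described above can be illustrated with a minimal Python sketch. The factor names and point values in `POINTS` are hypothetical placeholders, since the abstract does not report the actual weights, and the helper names are assumptions introduced only for illustration. The sketch sums points per patient, takes the 25th, 50th and 75th percentiles of the training-set scores, and assigns any score to one of four groups.

```python
from bisect import bisect_left
from statistics import quantiles

# Hypothetical points per risk factor; the actual weights are not given in the abstract.
POINTS = {"factor_a": 1, "factor_b": 2, "factor_c": 3}


def risk_score(patient_factors):
    """Sum the points of the risk factors present in one patient."""
    return sum(POINTS[f] for f in patient_factors)


def training_cut_points(train_scores):
    """Return the 25th, 50th and 75th percentiles of the training-set scores."""
    return quantiles(train_scores, n=4, method="inclusive")


def risk_group(score, cuts):
    """Map a score to group 0 (<=25th percentile) through 3 (>75th percentile)."""
    # bisect_left counts the cut points strictly below the score,
    # so a score equal to a cut point falls in the lower group.
    return bisect_left(cuts, score)


if __name__ == "__main__":
    cuts = training_cut_points([0, 1, 2, 3, 4, 5, 6, 7, 8])
    print(cuts, risk_group(risk_score(["factor_a", "factor_c"]), cuts))
```

Deriving the cut points from the training set only, and then applying them unchanged to validation patients, keeps the validation cohort independent of the grouping rule.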

From logistic regression in the validation cohort, the odds ratios for developing aGVHD 2-4 by day 100 post-transplant, compared with the ≤25th percentile group, were 1.50 (95% confidence interval [CI] 1.29-1.75, p<.0001) for the 25th to 50th percentile group, 2.0 (95% CI 1.78-2.40, p<.0001) for the 50th to 75th percentile group, and 3.1 (95% CI 2.72-3.65, p<.0001) for the >75th percentile group. The adjusted day-100 probability of aGVHD 2-4 was 26% in the ≤25th percentile group and 53% in the >75th percentile group. The corresponding odds ratios for developing aGVHD 3-4 by day 100 post-transplant were 1.4 (95% CI 1.11-1.74, p=0.0043) for the 25th to 50th percentile group, 2.0 (95% CI 1.61-2.49, p<.0001) for the 50th to 75th percentile group, and 3.2 (95% CI 2.64-3.98, p<.0001) for the >75th percentile group. The adjusted day-100 probability of aGVHD 3-4 was 9% in the ≤25th percentile group and 24% in the >75th percentile group. Cumulative incidences of aGVHD 2-4 and 3-4 in the training and validation sets, estimated with a stratified Fine-Gray model, are shown in Figure 1. When BART-based models were compared with logistic regression-based models using the concordance index, there were no significant differences for aGVHD 2-4 (p=0.078) or aGVHD 3-4 (p=0.99).
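For readers unfamiliar with the concordance index, the following is a minimal Python sketch of how it can be computed for a binary outcome such as aGVHD by day 100. It assumes the index is taken over all event/non-event pairs with ties counted as one half, which for a binary outcome equals the area under the ROC curve; the abstract does not state the exact definition or the significance test used, so neither is reproduced here. The function name and inputs are hypothetical.

```python
def concordance_index(y_true, y_score):
    """Share of (event, non-event) pairs where the event has the higher score.

    Tied scores count as half a concordant pair.
    """
    events = [s for y, s in zip(y_true, y_score) if y == 1]
    non_events = [s for y, s in zip(y_true, y_score) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


if __name__ == "__main__":
    # Toy data for illustration only, not study data.
    print(concordance_index([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.4]))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination, so two models can be compared by evaluating the same validation patients with each model's predicted risks.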

We have created what is, to our knowledge, the first validated clinical risk score for developing aGVHD 2-4 and 3-4 following allo-HCT, based on a cohort with broad eligibility criteria. This risk score can guide personalized clinical decision making, patient counseling, and the design of clinical trials testing novel prophylactic interventions. Finally, the BART-based models showed similar performance to the logistic regression-based models.

Pidala: Incyte: Consultancy, Membership on an entity's Board of Directors or advisory committees; Amgen: Consultancy, Membership on an entity's Board of Directors or advisory committees; Regeneron: Consultancy, Membership on an entity's Board of Directors or advisory committees; Novartis: Research Funding; Takeda: Research Funding; Johnson and Johnson: Research Funding; Pharmacyclics: Research Funding; Abbvie: Research Funding; CTI Biopharma: Consultancy, Membership on an entity's Board of Directors or advisory committees; Syndax: Consultancy, Membership on an entity's Board of Directors or advisory committees; BMS: Research Funding. Kitko: Horizon Therapeutics: Consultancy. Lee: National Marrow Donor Program: Membership on an entity's Board of Directors or advisory committees; Janssen: Other: Provision of study medication; Amgen, AstraZeneca, Incyte, Kadmon, Novartis, Pfizer, Syndax, Takeda: Research Funding; Mallinckrodt, Equillium: Consultancy.

